Smart Paper Notes

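Clinical notes are an efficient way to record patient information, but they are hard for non-experts to decipher. Automatically simplifying medical text can give patients valuable information about their health while saving clinicians time. We present a novel approach to the automated simplification of medical text based on word frequencies and language modelling, grounded in medical ontologies enriched with layman terms. We release a new dataset of pairs of publicly available medical sentences and versions of them simplified by clinicians. In addition, we define a novel text simplification metric and evaluation framework, which we use to conduct a large-scale human evaluation of our method against the state of the art. Our method, based on a language model trained on medical forum data, generates simpler sentences while preserving both grammar and the original meaning, surpassing the current state of the art.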

translated by Google Translate

Empirical Asset Pricing via Ensemble Gaussian Process Regression

Damir Filipović , Puneet Pasricha

Category: Machine Learning

2022-12-02

We introduce an ensemble learning method based on Gaussian Process Regression (GPR) for predicting conditional expected stock returns given stock-level and macro-economic information. Our ensemble learning approach significantly reduces the computational complexity inherent in GPR inference and lends itself to general online learning tasks. We conduct an empirical analysis on a large cross-section of US stocks from 1962 to 2016. We find that our method dominates existing machine learning models statistically and economically in terms of out-of-sample $R$-squared and Sharpe ratio of prediction-sorted portfolios. Exploiting the Bayesian nature of GPR, we introduce the mean-variance optimal portfolio with respect to the predictive uncertainty distribution of the expected stock returns. It appeals to an uncertainty averse investor and significantly dominates the equal- and value-weighted prediction-sorted portfolios, which outperform the S&P 500.
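
The two evaluation measures named in the abstract, out-of-sample $R$-squared and the Sharpe ratio, can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so this assumes the zero-forecast benchmark (no demeaning) that is common in empirical asset pricing, and an annualization factor of 12 for monthly returns. The function names and the sample numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def oos_r_squared(realized, predicted):
    """Out-of-sample R^2 against a zero-forecast benchmark (no demeaning).

    Assumption: the abstract does not state the definition; this is one
    common convention, not necessarily the authors' exact formula.
    """
    if len(realized) != len(predicted):
        raise ValueError("realized and predicted must have the same length")
    # Sum of squared forecast errors.
    sse = sum((r - p) ** 2 for r, p in zip(realized, predicted))
    # Sum of squared realized returns (the error of a constant zero forecast).
    sst = sum(r ** 2 for r in realized)
    return 1.0 - sse / sst


def sharpe_ratio(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio from periodic excess returns.

    Uses the sample standard deviation; periods_per_year=12 assumes
    monthly data.
    """
    return mean(excess_returns) / stdev(excess_returns) * math.sqrt(periods_per_year)


# Hypothetical monthly numbers, for illustration only.
realized = [0.02, -0.01, 0.03, 0.00]
predicted = [0.01, 0.00, 0.02, 0.01]

r2 = oos_r_squared(realized, predicted)
sr = sharpe_ratio(realized)
print(f"out-of-sample R^2: {r2:.4f}")
print(f"annualized Sharpe ratio: {sr:.4f}")
```

With this convention, a forecast of zero everywhere scores exactly 0, so a positive value means the model beats the naive benchmark. In a prediction-sorted portfolio setting, `sharpe_ratio` would be applied to the realized excess returns of the portfolio rather than to individual stock returns.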
